Airbnb boasts over a million listings in 34,000 cities and, according to data from Inside Airbnb, an independent data analysis website, listed about 36,000 apartments in New York as of July 5, 2016. This data exploration sets out to visualize how Airbnb operates in New York City. Airbnb’s presence in NYC has been clouded in controversy from the beginning, with lawmakers arguing that Airbnb drives up rents for New York residents and facilitates a lot of illegal hosting activity, all while paying none of the fees hotels are subject to. Rents are driven up when landlords choose to rent apartments to short-term guests at higher rates rather than sign tenants to yearlong leases. In a study conducted in 2014, the New York State Attorney General concluded that 72% of all units used as private short-term rentals on Airbnb from 2010 through mid-2014 appeared to violate both state and local New York laws. Most of these violations stem from the requirement that entire-home Airbnb rentals have a minimum stay of 30 days. Alongside illegal activity, I’m going to investigate the number of listings that are available for long, extended periods throughout the year.
To start things off, let’s look at the distributions of Price, Reviews, Availability and Minimum Nights. These plots primarily investigate how the five boroughs differ when we compare their prices, reviews, and availability. It makes sense to expect Manhattan to be the most expensive borough, followed by Brooklyn or Staten Island.
First up, let’s take a look at how prices are distributed across the five boroughs.
We can clearly see that there are some outliers. The next plot excludes them and takes a closer look.
Next, I thought it was worth looking at the outliers. We can see that Staten Island has the highest number of price outliers.
Finally, the best way to look at the prices is by taking the logarithm. This shows that Manhattan is the priciest followed by Brooklyn, then Staten Island, then Queens and lastly the Bronx.
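To make the log comparison concrete, here is a minimal sketch in Python of ranking boroughs by the median of log-transformed prices. The sample rows and the function name `median_log_price` are made up for illustration and are not the Inside Airbnb data or the code behind the plots.

```python
import math
from collections import defaultdict
from statistics import median

# Hypothetical sample rows: (borough, nightly price in USD). Not real data.
listings = [
    ("Manhattan", 150), ("Manhattan", 220), ("Manhattan", 5000),
    ("Brooklyn", 90), ("Brooklyn", 120), ("Brooklyn", 140),
    ("Bronx", 50), ("Bronx", 65), ("Bronx", 80),
]


def median_log_price(rows):
    """Return {borough: median of log10(price)}, skipping non-positive prices."""
    by_borough = defaultdict(list)
    for borough, price in rows:
        if price > 0:  # the logarithm is undefined for zero or negative prices
            by_borough[borough].append(math.log10(price))
    return {borough: median(values) for borough, values in by_borough.items()}


log_medians = median_log_price(listings)

# Boroughs ordered from priciest to cheapest on the log scale.
ranking = sorted(log_medians, key=log_medians.get, reverse=True)
print(ranking)
```

On the log scale a single extreme listing, such as the 5000 in the sample, no longer stretches the axis, which is why the borough ordering becomes easier to read.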
The reviews_per_month variable gives us a good indication of how much activity goes on in each borough. The more reviews a listing receives per month, the more often it is likely being rented out.
When we zoom into the outlier region, we can see that only a handful of listings have more than 8 reviews per month.
The last distribution I investigate is how availability differs across the five boroughs. What struck me as quite interesting is how many days per year these listings are available. The distribution shows that listings are either available for a very short period or essentially for all 365 days of the year.
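One way to put numbers on that two-peaked shape is to measure the share of listings at each end of the availability range. The sketch below is only an illustration: the 30-day and 335-day cut-offs and the sample values are assumptions, not figures from this analysis.

```python
def availability_shares(days_available, low=30, high=335):
    """Split listings into short, middle, and near-full-year availability.

    `low` and `high` are hypothetical cut-offs (in days per year), and
    `low` is expected to be smaller than `high`.
    """
    total = len(days_available)
    if total == 0:
        return {"short": 0.0, "middle": 0.0, "near_full_year": 0.0}
    short = sum(1 for d in days_available if d <= low)
    near_full = sum(1 for d in days_available if d >= high)
    return {
        "short": short / total,
        "middle": (total - short - near_full) / total,
        "near_full_year": near_full / total,
    }


# Hypothetical availability values (days available out of 365), not real data.
sample = [0, 5, 12, 365, 364, 350, 180, 90, 0, 365]
shares = availability_shares(sample)
print(shares)
```

A small "middle" share relative to the two ends is what the bimodal histogram looks like when reduced to three numbers.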
A quick look at estimated occupancy rates for Entire Home rentals in New York City (as of July 2, 2016) shows that more than 6,000 entire homes are being rented for more than half the year and are most likely no longer available on the rental or owner-occupied housing markets.
Now let’s test the claim that 72% of listings are illegal. Under state law, it is illegal to lease most homes (with the exception of one- and two-family residences) for periods of less than 30 days when the owner or tenant is not present. This means that if an apartment is offered for stays of fewer than 30 days and is listed as an “entire home”, it is most likely an illegal listing. The data shows that there is definitely something fishy going on.
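The rule of thumb described here can be sketched as a simple filter. The field names `room_type` and `minimum_nights` and the value "Entire home/apt" are assumptions about how the listings data is laid out, and the sample listings are invented. The sketch also ignores the one- and two-family exemption and whether the host is present, neither of which can be read from these two fields, so it flags candidates rather than proving a violation.

```python
from dataclasses import dataclass

ENTIRE_HOME = "Entire home/apt"  # assumed label for entire-home listings
MIN_LEGAL_STAY = 30              # minimum stay in days, as described in the text


@dataclass
class Listing:
    listing_id: int
    room_type: str
    minimum_nights: int


def is_likely_illegal(listing):
    """Flag entire-home listings that accept stays shorter than 30 days."""
    return (listing.room_type == ENTIRE_HOME
            and listing.minimum_nights < MIN_LEGAL_STAY)


# Hypothetical listings for illustration only.
sample = [
    Listing(1, "Entire home/apt", 2),
    Listing(2, "Private room", 1),
    Listing(3, "Entire home/apt", 30),
    Listing(4, "Entire home/apt", 7),
    Listing(5, "Shared room", 3),
]

flagged = [l.listing_id for l in sample if is_likely_illegal(l)]
share_flagged = len(flagged) / len(sample)
print(flagged, share_flagged)
```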
The plot above shows that the Median Income is correlated with the Median Price of Private Room Listings. As the Median Income increases, starting from the bottom left of the plot, so does the Median Price of Private Room Listings. The correlation coefficient is 0.69.
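For readers who want to see what a correlation coefficient like this measures, here is a minimal sketch of the Pearson correlation written out by hand. The income and price figures are invented placeholders, one pair per hypothetical neighbourhood, and the function is an illustration rather than the code that produced the 0.69 above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical neighbourhood medians: household income (USD) and
# private-room price (USD per night). Not real data.
median_income = [30000, 45000, 60000, 90000, 120000]
median_price = [50, 60, 75, 85, 110]

r = pearson(median_income, median_price)
print(round(r, 3))
```

A value close to 1 means price rises almost in lockstep with income, while a value near 0 would mean the two barely move together.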
The relationship between the Median Entire Home Price and the Median Income for each neighbourhood shows a positive correlation of 0.61. Wealthier neighbourhoods list more expensive apartments. We again see that Brooklyn and Manhattan have the most expensive listings.
After applying a log transformation, we observe a linear relationship, with a correlation of 0.632 between the log of median income and the percentage of listings that are entire apartments. I then coloured the points by borough. We see that Brooklyn and Manhattan have the highest percentage of entire-apartment listings.
This section of my data exploration visualizes Airbnb’s listings across the city spatially. I’m going to look at prices, activity levels, and which rental units are illegal under New York law.
Red on the map marks the expensive listings, and yellow denotes the cheaper ones. As expected, Manhattan has the highest concentration of expensive listings, especially in areas like Soho, Chelsea and the Financial District. The next plot shows how prices are distributed across the neighbourhoods.
The plot shows exactly what we could have guessed: Manhattan has the more expensive listings on average, and the most expensive neighbourhoods are concentrated around Chelsea, Soho and the Financial District. The Bronx and the outskirts of Brooklyn have the cheapest neighbourhoods on average.
Above we can see how listing types are distributed across NYC. The more expensive regions are clearly also the regions with more entire-apartment listings. Shared rooms are rarely seen.